Get Data

This notebook is based almost entirely on the code in notebooks 09 and 11.

We have latent variable expression analysis data:
- Latent Variable Table

For this analysis we also use any samples for which there are gene variant data (cNFs, pNFs, MPNSTs):
- Exome-Seq variants
- WGS variants

Let’s see if there are any LVs that split based on gene variant. Because we’re having trouble scaling with the number of latent variables, I only look at rare variants: the filter below keeps those with a gnomAD allele frequency under 1% (gnomAD_AF < 0.01). Notice this is a difference from notebook #11.

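# Packages assumed by this notebook (its setup chunk is not shown):
# synapser for synTableQuery(), dplyr/tidyr for the pipelines, ggplot2 for plots.
library(synapser)
library(dplyr)
library(tidyr)
library(ggplot2)
synLogin()  # assumes Synapse credentials are already configured

wgs.vars=synTableQuery("SELECT Hugo_Symbol,Protein_position,specimenID,IMPACT,FILTER,ExAC_AF,gnomAD_AF FROM syn20551862")$asDataFrame()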
exome.vars=synTableQuery("SELECT Hugo_Symbol,Protein_position,specimenID,IMPACT,FILTER,ExAC_AF,gnomAD_AF FROM syn20554939")$asDataFrame()
all.vars<-rbind(select(wgs.vars,'Hugo_Symbol','Protein_position','specimenID','IMPACT','gnomAD_AF'),
    select(exome.vars,'Hugo_Symbol','Protein_position','specimenID','IMPACT','gnomAD_AF'))%>%
  subset(gnomAD_AF<0.01)
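
One thing worth checking in the allele-frequency filter: `subset()` drops rows whose condition evaluates to `NA`, so variants with no gnomAD entry are removed along with the common ones. The sketch below uses a made-up three-row table (gene names and frequencies are illustrative, not taken from the Synapse tables) to show the difference between the filter as written and one that keeps unannotated variants. Whether novel variants should be kept is a judgment call that this notebook does not state.

```r
# Toy variant table; values are illustrative only.
toy.vars <- data.frame(
  Hugo_Symbol = c("GENE_A", "GENE_B", "GENE_C"),
  gnomAD_AF   = c(0.0004, 0.25, NA),   # NA = variant absent from gnomAD
  stringsAsFactors = FALSE
)

# Filter as written in the notebook: NA comparisons count as FALSE.
rare.only <- subset(toy.vars, gnomAD_AF < 0.01)

# Alternative that also keeps variants with no gnomAD frequency.
rare.or.novel <- subset(toy.vars, is.na(gnomAD_AF) | gnomAD_AF < 0.01)

print(rare.only$Hugo_Symbol)
print(rare.or.novel$Hugo_Symbol)
```

Comparing `nrow()` before and after the filter on the real `all.vars` table would show how many variants are affected.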


top.lvs<-synTableQuery("SELECT * from syn21318452")$asDataFrame()
mp_res<-synTableQuery("SELECT * FROM syn21046991")$asDataFrame()%>%
  filter(isCellLine != "TRUE")%>%
  subset(latent_var%in%top.lvs$LatentVar)%>%
  select(latent_var,id,value,specimenID,tumorType,modelOf,diagnosis)

Merge data together

For the purposes of this analysis we want only those samples with genomic data and only those latent variables that are highly variable (the variability filter is currently commented out in the code below).

samps<-intersect(mp_res$specimenID,all.vars$specimenID)

mp_res<-mp_res%>%
  subset(specimenID%in%samps)#%>%
#  group_by(latent_var) %>%
#  mutate(sd_value = sd(value)) %>%
#  filter(sd_value > 0.025) %>%
#  ungroup()
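
For reference, the commented-out variability filter can be sketched in base R as follows. The data are invented, and the 0.025 cutoff is simply the value from the commented-out lines; this is an illustration of the idea, not a claim about which latent variables would survive on the real table.

```r
# Toy long-format scores: one row per (latent variable, specimen).
toy.scores <- data.frame(
  latent_var = rep(c("LV 1", "LV 2"), each = 4),
  specimenID = rep(paste0("s", 1:4), times = 2),
  value      = c(0.10, 0.50, -0.30, 0.20,    # LV 1 varies across samples
                 0.01, 0.02, 0.01, 0.02),    # LV 2 is nearly constant
  stringsAsFactors = FALSE
)

# Per-LV standard deviation, repeated on every row of that LV.
toy.scores$sd_value <- ave(toy.scores$value, toy.scores$latent_var, FUN = sd)

# Keep only latent variables whose scores vary more than the cutoff.
variable.lvs <- toy.scores[toy.scores$sd_value > 0.025, ]
print(unique(variable.lvs$latent_var))
```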

Retrieve Variant Data

Let’s join the LV scores to the variant calls and count, per tumor type, the genes that are mutated in enough samples to test.

data.with.var<-mp_res%>%subset(specimenID%in%samps)%>%
  left_join(all.vars,by='specimenID')

tab<-subset(data.with.var,!tumorType%in%c('Other','High Grade Glioma','Low Grade Glioma'))

top.genes=tab%>%#group_by(tumorType)%>%
  mutate(numSamps=n_distinct(specimenID))%>%
  group_by(Hugo_Symbol)%>%
  mutate(numMutated=n_distinct(specimenID))%>%
  ungroup()%>%
  subset(numMutated>2)%>%
  subset(numMutated<(numSamps-2))%>%
  select(tumorType,Hugo_Symbol,numSamps,numMutated)%>%distinct()

gene.count=top.genes%>%group_by(tumorType)%>%mutate(numGenes=n_distinct(Hugo_Symbol))%>%select(tumorType,numGenes)%>%distinct()

DT::datatable(gene.count)
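
The gene filter above keeps genes mutated in more than two samples but not in nearly all of them. Because the `group_by(tumorType)` step is commented out, `numSamps` is the total across all tumor types. A minimal base-R sketch of the same thresholds on an invented table (gene and sample names are placeholders):

```r
# Toy sample-by-variant table: one row per (specimen, mutated gene).
toy.tab <- data.frame(
  specimenID  = c("s1", "s2", "s3",
                  "s1", "s2", "s3", "s4", "s5", "s6",
                  "s1"),
  Hugo_Symbol = c(rep("GENE_A", 3), rep("GENE_B", 6), "GENE_C"),
  stringsAsFactors = FALSE
)

# Total number of distinct samples (not split by tumor type).
num.samps <- length(unique(toy.tab$specimenID))

# Number of distinct specimens carrying a variant in each gene.
num.mutated <- vapply(
  split(toy.tab$specimenID, toy.tab$Hugo_Symbol),
  function(ids) length(unique(ids)),
  integer(1)
)

# Same thresholds as the notebook: more than 2, fewer than numSamps - 2.
keep <- num.mutated > 2 & num.mutated < (num.samps - 2)
kept.genes <- names(num.mutated)[keep]
print(kept.genes)
```

Here GENE_B is dropped for being mutated in every sample and GENE_C for being mutated in only one, which is the intent of the two `subset()` calls.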

Test significance of each gene/immune population

Now we can test every latent variable and gene pair.

red.genes<-c("NF1","SUZ12","CDKN2A","EED")##for testing

##first spread the WT/Mutated values
vals<-tab%>%subset(Hugo_Symbol%in%top.genes$Hugo_Symbol)%>%
    mutate(mutated=ifelse(is.na(IMPACT),'WT','Mutated'))%>%
  select(latent_var,tumorType,value,Hugo_Symbol,specimenID,mutated)%>%
  distinct()%>%
  spread(key=Hugo_Symbol,value='mutated',fill='WT')
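
As far as I can tell from the code, the variant table only has rows for observed variants, so most 'WT' labels come from `fill='WT'` in `spread()` rather than from the `is.na(IMPACT)` test. The base-R sketch below shows the same idea on invented data: cross-tabulate observed hits and treat every missing sample/gene combination as wild type.

```r
# Toy long table of observed variants: one row per (specimen, gene) hit.
toy.hits <- data.frame(
  specimenID  = c("s1", "s1", "s2", "s3"),
  Hugo_Symbol = c("GENE_A", "GENE_B", "GENE_A", "GENE_B"),
  stringsAsFactors = FALSE
)
all.samples <- c("s1", "s2", "s3", "s4")   # s4 has no variant in either gene

# Cross-tabulate, forcing every sample to appear even with zero hits.
hit.counts <- table(
  specimenID  = factor(toy.hits$specimenID, levels = all.samples),
  Hugo_Symbol = toy.hits$Hugo_Symbol
)

# Any hit -> "Mutated"; absent combinations default to "WT".
status <- ifelse(unclass(hit.counts) > 0, "Mutated", "WT")
print(status)
```

One consequence is that a sample is called 'WT' for a gene whenever no rare variant passed the earlier filters, which is not the same as confirmed wild type.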

##double check to make sure there are both mutated and unmutated values
counts<-vals%>%
  gather(key=gene,value=status,-c(latent_var,tumorType,value,specimenID))%>% 
    select(latent_var,tumorType,value,gene,specimenID,status)%>%
    group_by(latent_var,gene)%>%
    mutate(numVals=n_distinct(status))%>%
    mutate(numSamps=n_distinct(specimenID))%>%
    subset(numVals==2)%>%ungroup()

#so now we have only 
with.sig<-counts%>%ungroup()%>%#subset(gene%in%top.genes$Hugo_Symbol)%>%
    group_by(latent_var,gene)%>%
  mutate(pval=t.test(value~status)$p.value)%>%ungroup()%>%
  group_by(latent_var)%>%
  mutate(corP=p.adjust(pval))%>%ungroup()%>%
  select(latent_var,gene,pval,corP)%>%distinct()

sig.vals<-subset(with.sig,corP<0.01)

DT::datatable(sig.vals)
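
Two details of the correction step are easy to miss. `p.adjust()` defaults to the Holm method, and in the pipeline above it runs before `distinct()`, so each gene's p-value appears once per sample row when the correction is computed. The toy numbers below (invented for illustration) show that repeating p-values makes Holm more conservative than adjusting one p-value per test; that may or may not be intended here.

```r
# One p-value per gene for a single latent variable (made-up numbers).
pvals <- c(GENE_A = 0.01, GENE_B = 0.04)

# Adjust once per test; p.adjust() uses method = "holm" unless told otherwise.
adj.unique <- p.adjust(pvals)

# Same p-values repeated three times, as happens when each gene's p-value
# sits on every sample row before distinct() is applied.
adj.repeated <- p.adjust(rep(pvals, each = 3))

print(adj.unique)
print(unique(unname(adj.repeated)))
```

If one adjusted value per test is wanted, moving `select(...)%>%distinct()` ahead of the `p.adjust()` step would be one way to get it.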

Interesting! Some genes actually pass p-value correction. What do they look like? Here let’s write the messiest possible code to print.

for(ct in unique(sig.vals$latent_var)){
  tplot<-sig.vals[which(sig.vals$latent_var==ct),]
  if(nrow(tplot)==0)
    next

  print(ct)
  sigs=tplot%>%rowwise()%>%mutate(vals=paste(gene,format(corP,digits=3),sep=':'))%>%select(vals)%>%unlist()%>%paste(collapse=',')
  print(sigs)
  p<-counts%>%
    subset(latent_var==ct)%>%
    subset(gene%in%tplot$gene)%>%
    ggplot(aes(x=gene,y=value,col=status))+
    geom_boxplot(outlier.shape=NA)+
    geom_point(position=position_jitterdodge(),aes(shape=tumorType,group=status))+
    theme(axis.text.x = element_text(angle = 90, hjust = 1))+
    ggtitle(paste(ct,'scores\n',sigs))
#    if(method=='cibersort')
#      p<-p+scale_y_log10()
  print(p)
}
## [1] "451,REACTOME_MITOCHONDRIAL_PROTEIN_IMPORT"
## [1] "AAAS:1.74e-05,AADAC:1.74e-05,AFAP1:1.74e-05,AIPL1:1.74e-05,AL050302.1:7.56e-06,ANKRD20A1:7.56e-06,ANKRD34C:1.74e-05,ANKRD36:9.01e-05,ANKRD36C:6.71e-05,ANP32B:9.8e-06,APOM:1.74e-05,AQP7:2.74e-05,ARHGEF33:1.74e-05,ARHGEF39:1.74e-05,ATOH1:1.74e-05,B4GALT7:1.74e-05,BIK:1.74e-05,C11orf86:1.74e-05,C19orf54:1.74e-05,C1orf174:1.74e-05,CA7:1.74e-05,CACNA1C:1.74e-05,CAPN7:1.74e-05,CATSPERG:1.74e-05,CCDC144NL:6.48e-05,CCDC64B:1.74e-05,CD300LD:1.74e-05,CDC27:7.56e-06,CDH9:1.74e-05,CETN1:1.74e-05,CHD1L:1.74e-05,CHRNB2:1.74e-05,CLSTN1:1.74e-05,CMTM3:1.74e-05,CNN2:0.00246,CPB1:1.74e-05,CTBP2:6.15e-05,CTDSP2:7.56e-06,CTPS2:1.74e-05,CTTN:1.74e-05,CUZD1:1.74e-05,DBN1:1.74e-05,DHDH:1.74e-05,EIF4G3:1.74e-05,ENO1:1.74e-05,FAM104B:7.56e-06,FAM182B:7.56e-06,FAM227B:8.47e-08,FAM83B:1.74e-05,FGD2:1.74e-05,FRG1B:7.56e-06,FUT2:0.00542,GGT1:3.55e-05,GJA10:1.74e-05,GLG1:1.74e-05,GNB1L:0.00125,GUCY2C:1.74e-05,HEYL:1.74e-05,HK3:1.74e-05,HNRNPCL1:7.56e-06,HYDIN:9.3e-06,IGHG4:1.74e-05,IGLV2-8:1.74e-05,IGSF3:7.56e-06,IL18:1.74e-05,INO80D:1.74e-05,INSRR:1.74e-05,ITGB7:1.74e-05,ITIH6:1.74e-05,KAZN:1.74e-05,KCTD16:1.74e-05,KIAA0391:1.74e-05,KIAA1239:1.74e-05,KIF18A:1.74e-05,KRT18:0.00578,KRTAP5-4:8.47e-08,KRTCAP3:1.74e-05,LEPROT:1.74e-05,LRP2BP:1.74e-05,LRRC10:1.74e-05,LRRIQ4:1.74e-05,LY6G6C:1.74e-05,LY6H:1.74e-05,MUC12:0.0011,MUC3A:4.89e-06,MUC6:0.000667,MYO1C:1.74e-05,MYRIP:1.74e-05,NPIPB11:9.68e-08,NPLOC4:1.74e-05,OLIG1:1.74e-05,OR10H3:1.74e-05,OR2M5:1.74e-05,OR4C5:4.06e-05,OR5B17:1.74e-05,OR8U1:6.4e-05,P2RX6:1.74e-05,PABPC3:0.00439,PCDHB14:1.74e-05,PDYN:1.74e-05,PDZD7:1.74e-05,PGA3:1.74e-05,PGM5:1.74e-05,POTED:0.000853,PRAMEF2:2.35e-05,PRPH:1.74e-05,PRRG2:1.74e-05,PRSS1:0.00485,PRSS3:7.56e-06,PTK2B:1.74e-05,PXN:1.74e-05,QSOX2:1.74e-05,RACGAP1:1.74e-05,RAD21L1:1.74e-05,RP9:1.74e-05,S1PR3:1.74e-05,SEL1L3:1.74e-05,SHOX:1.74e-05,SLC25A5:7.56e-06,SPAST:1.74e-05,SPESP1:1.74e-05,SPINT2:1.74e-05,STK32A:1.74e-05,TAS2R19:7.56e-06,TAS2R31:7.56e-06,TAS2R43:0.000195,THEM6:1.74e-05,TMEM168:1.74e-05,TMEM2:0.00291,TMEM53:1.74e-05,TNFRSF4:1.74e-05,TNFRSF6B:1.74e-05,TRBV10-1:7.56e-06,TRBV7-3:7.56e-06,TRMT2B:1.74e-05,TTBK1:1.74e-05,TXNDC17:1.74e-05,TYR:1.74e-05,UBXN7:1.74e-05,UMPS:1.74e-05,UNC5CL:1.74e-05,UPF2:1.74e-05,ZDHHC7:1.74e-05,ZMYM3:1.74e-05,ZNF717:7.56e-06,ZNF76:0.00521"

## [1] "LV 420"
## [1] "ABCB1:0.00153,ANKRD7:0.00153,AP001024.1:0.00153,ATXN2:0.000491,C16orf3:0.00153,C17orf53:0.00153,C9orf131:0.00153,CC2D1A:0.00153,CCDC169-SOHLH2:0.00153,CD209:0.00775,CEACAM5:0.000217,CNKSR1:0.00153,COL4A3BP:0.00425,DCTN1:0.00741,DENND4A:0.00153,DNASE2B:0.00153,FAM181B:0.00153,FANCC:0.00153,FBP1:0.00153,GCOM1:0.00153,HDAC9:0.00153,HUS1B:0.00153,IFNAR2:0.00425,IFNGR2:0.00153,INTS4:0.00153,KRT81:0.00153,KRTAP4-6:0.000491,MSH6:0.00741,MYOF:0.00425,NCSTN:0.00153,NEK11:0.00153,NOL12:0.00153,OR51G1:0.00153,PARPBP:0.00153,PLXNC1:0.00153,PMF1-BGLAP:0.00153,PPP1R37:0.00153,PRKCSH:0.00741,PYGM:0.00153,RAD54B:0.00153,RAMP3:0.00153,SLC35D2:0.00153,SMARCC2:0.000491,SNX21:0.00153,TADA2A:0.00153,TTI2:0.000503,UPF3A:0.00153,ZNF157:0.00153,ZNF37A:0.00153,ZNF532:0.00741,ZNF541:0.00153,ZNF90:0.00153"

## [1] "LV 849"
## [1] "ABCC12:0.000486,ADAMTS20:2.72e-06,AL050302.1:6.41e-05,ANKRD20A1:6.41e-05,ANKRD36:0.00737,ANP32B:0.00258,ART1:0.000442,ART3:0.00109,ASB18:4.43e-06,ATAD3C:0.00643,ATP10B:2.72e-06,CCDC144NL:0.00162,CDC27:6.41e-05,CLDN8:0.00941,CSGALNACT1:0.00514,CTBP2:0.00241,CTDSP2:6.41e-05,DAB2IP:2.72e-06,DIP2A:2.72e-06,EFCAB12:0.00514,FAM104B:6.41e-05,FAM182B:6.41e-05,FAM212A:0.00941,FRG1B:6.41e-05,GGT1:0.00171,GRIK4:2.72e-06,GSDMB:2.72e-06,GSDMD:2.72e-06,GTF3A:0.000706,HNRNPCL1:6.41e-05,HYDIN:0.00163,IGSF3:6.41e-05,ILDR1:0.00643,INMT:2.72e-06,KCNK5:2.72e-06,KCTD8:0.00155,KIAA1211:2.72e-06,MUC15:0.000486,MUC3A:0.00142,NPIPB15:0.00138,NR5A2:0.00643,OMA1:0.00342,OR4M2:0.00893,OR5AU1:2.72e-06,PCBP4:0.00342,PHRF1:0.000488,PLA2G4C:0.00643,POTED:0.00164,PRKCQ:0.00739,PRSS3:6.41e-05,RHOT2:2.72e-06,RP11-766F14.2:8.05e-05,SEC14L4:0.00833,SERPINB11:0.00941,SLC25A5:6.41e-05,TAS2R19:6.41e-05,TAS2R31:6.41e-05,TBC1D31:2.72e-06,TCF20:8.21e-06,TRBV10-1:6.41e-05,TRBV7-3:6.41e-05,UMODL1:9.65e-05,USP19:0.00941,USP4:0.00941,USP43:0.000752,VPS13C:9.57e-07,ZNF717:6.41e-05"

## [1] "LV 185"
## [1] "ABCF1:5.94e-05,AL050302.1:0.00049,ANKRD20A1:0.00049,ANKRD36:0.00683,ANP32B:0.00125,CCDC144NL:8.56e-05,CDC27:0.00049,CTBP2:0.00256,CTDSP2:0.00049,FAM104B:0.00049,FAM182B:0.00049,FRG1B:0.00049,GGT1:0.00203,HNRNPCL1:0.00049,HYDIN:0.00235,IGSF3:0.00049,MDM1:3.29e-08,MUC3A:0.00225,OR8U1:0.000143,POTED:0.00375,PRAMEF2:0.000796,PRSS3:0.00049,SLC25A5:0.00049,TAS2R19:0.00049,TAS2R31:0.00049,TENM3:1.85e-08,TRBV10-1:0.00049,TRBV7-3:0.00049,ZNF717:0.00049"

## [1] "LV 521"
## [1] "AC187652.1:0.00184"

## [1] "1,REACTOME_MRNA_SPLICING"
## [1] "ADAMTSL3:0.00156,AIRE:0.00156,ANKRD18A:0.00156,ANKRD31:0.00139,APBB1IP:0.00224,CEP350:0.00224,CEP85L:0.00156,COL5A2:0.000439,CORO7:0.00156,CPNE3:0.000463,CPS1:0.000463,DDI1:0.0041,DUSP15:0.00446,FAM175A:0.0069,FDXR:0.00228,GLI2:0.000536,HIBCH:0.00722,KLHL38:0.00156,KRTAP4-16P:0.00446,NUMBL:0.00602,NUP133:0.000463,OR2S2:0.00156,PPFIA4:0.000834,PRRC2A:0.000463,REEP3:0.00139,RESP18:0.00156,RIN1:0.00141,SHC2:0.00156,SLC1A7:0.0022,SLC7A14:0.00321,TAS1R1:0.00157,TICRR:0.00409,WDR76:0.00139,ZNF862:0.00156,ZNF93:0.0046"

## [1] "4,REACTOME_NEURONAL_SYSTEM"
## [1] "AL050302.1:0.000757,ANKRD20A1:0.000757,AQP7:0.00183,C1orf51:5.16e-08,CCDC144NL:0.00138,CDC27:0.000757,CTBP2:0.000824,CTDSP2:0.000757,FAM104B:0.000757,FAM182B:0.000757,FRG1B:0.000757,GGT1:0.00182,HNRNPCL1:0.000757,HYDIN:0.00706,IGSF3:0.000757,MUC3A:0.00801,POTED:0.00611,PRSS1:0.000749,PRSS3:0.000757,SLC25A5:0.000757,TAS2R19:0.000757,TAS2R31:0.000757,TRBV10-1:0.000757,TRBV7-3:0.000757,UNC13D:8.26e-07,ZNF717:0.000757"

## [1] "LV 308"
## [1] "AL050302.1:3.26e-05,ANKRD20A1:3.26e-05,ANKRD36:0.000713,ANKRD36C:0.00888,ANP32B:0.000242,AQP7:6.1e-05,CCDC144NL:0.000906,CDC27:3.26e-05,CTBP2:9.9e-05,CTDSP2:3.26e-05,EPOR:0.00094,FAM104B:3.26e-05,FAM170A:2.64e-06,FAM182B:3.26e-05,FRG1B:3.26e-05,GGT1:0.00417,HNRNPCL1:3.26e-05,HYDIN:0.000608,IGSF3:3.26e-05,MUC12:0.000899,MUC3A:0.00123,MUC6:0.00358,OR4C5:0.000147,OR8U1:0.00499,POTED:0.0031,PRSS1:0.00173,PRSS3:3.26e-05,RP11-231C14.4:2.26e-06,SLC22A31:2.76e-05,SLC25A5:3.26e-05,TAS2R19:3.26e-05,TAS2R31:3.26e-05,TRBV10-1:3.26e-05,TRBV7-3:3.26e-05,UBR4:1.57e-06,ZNF717:3.26e-05"

## [1] "LV 442"
## [1] "AL050302.1:0.00172,ANKRD20A1:0.00172,ANP32B:0.00198,CDC27:0.00172,CTBP2:0.00275,CTDSP2:0.00172,FAM104B:0.00172,FAM175A:0.00114,FAM182B:0.00172,FRG1B:0.00172,GGT1:0.00149,GYS2:0.000896,HNRNPCL1:0.00172,IGSF3:0.00172,KLHL41:0.000412,LRRC17:6e-04,POTED:0.00487,PRSS3:0.00172,SLC1A7:0.0019,SLC25A5:0.00172,SNX19:0.000961,TAS2R19:0.00172,TAS2R31:0.00172,TRBV10-1:0.00172,TRBV7-3:0.00172,ZNF598:0.00411,ZNF717:0.00172,ZNF721:0.000223"

## [1] "LV 445"
## [1] "AL050302.1:0.000145,ANKRD20A1:0.000145,ANKRD36:0.00479,ANP32B:0.0015,AQP7:0.00235,CCDC144NL:0.00471,CDC27:0.000145,CNN2:0.00872,CTBP2:0.0027,CTDSP2:0.000145,FAM104B:0.000145,FAM182B:0.000145,FRG1B:0.000145,GGT1:0.000692,HNRNPCL1:0.000145,HYDIN:0.00123,IGSF3:0.000145,MUC3A:0.000441,NPIPB15:0.00143,POTED:0.000884,PRSS3:0.000145,SLC25A5:0.000145,TAS2R19:0.000145,TAS2R31:0.000145,TRBV10-1:0.000145,TRBV7-3:0.000145,ZNF717:0.000145"

## [1] "LV 492"
## [1] "AL050302.1:0.00993,ANKRD20A1:0.00993,AQP7:0.00212,CDC27:0.00993,CTDSP2:0.00993,FAM104B:0.00993,FAM182B:0.00993,FRG1B:0.00993,HNRNPCL1:0.00993,HYDIN:0.00264,IGSF3:0.00993,MUC12:0.000115,MUC6:0.000174,OR4C5:1.57e-05,PRSS3:0.00993,SLC25A5:0.00993,TAS2R19:0.00993,TAS2R31:0.00993,TRBV10-1:0.00993,TRBV7-3:0.00993,ZNF717:0.00993"

## [1] "LV 635"
## [1] "AL050302.1:0.00205,ANKRD20A1:0.00205,ANP32B:0.00788,AQP7:0.00366,CCDC144NL:0.00243,CDC27:0.00205,CTBP2:0.00805,CTDSP2:0.00205,FAM104B:0.00205,FAM182B:0.00205,FRG1B:0.00205,HNRNPCL1:0.00205,HYDIN:0.00219,IGSF3:0.00205,MUC12:0.00313,MUC6:0.0027,OR4C5:0.000601,OR8U1:0.00912,PRSS3:0.00205,SLC25A5:0.00205,TAS2R19:0.00205,TAS2R31:0.00205,TRBV10-1:0.00205,TRBV7-3:0.00205,ZNF717:0.00205,ZNF93:0.00864"

## [1] "LV 644"
## [1] "AL050302.1:0.000631,ANKRD20A1:0.000631,AQP7:0.00708,CCDC144NL:0.00215,CDC27:0.000631,CTBP2:0.000513,CTDSP2:0.000631,FAM104B:0.000631,FAM182B:0.000631,FRG1B:0.000631,GNB1L:3.15e-05,HNRNPCL1:0.000631,HYDIN:0.00104,IGSF3:0.000631,MUC3A:0.00277,PIGT:0.0049,POTED:0.0032,PRSS3:0.000631,SLC25A5:0.000631,TAS2R19:0.000631,TAS2R31:0.000631,TRBV10-1:0.000631,TRBV7-3:0.000631,ZNF717:0.000631,ZYX:0.000334"

## [1] "LV 653"
## [1] "AL050302.1:0.000269,ANKRD20A1:0.000269,ANP32B:0.00283,AQP7:0.00832,CCDC144NL:3.06e-05,CDC27:0.000269,COL5A2:1.39e-05,CPNE3:5.52e-05,CPS1:5.52e-05,CTBP2:0.00117,CTDSP2:0.000269,DUSP15:0.00081,FAM104B:0.000269,FAM182B:0.000269,FRG1B:0.000269,GBP4:0.000524,GGT1:0.00669,HNRNPCL1:0.000269,HYDIN:0.00079,IGSF3:0.000269,KRTAP4-16P:0.00081,MUC3A:0.00772,NUP133:5.52e-05,OR4C5:0.0083,OR8U1:0.000311,POTED:0.00235,PRAMEF2:0.00554,PRRC2A:5.52e-05,PRSS3:0.000269,SLC25A5:0.000269,SLC7A14:0.00944,TAS2R19:0.000269,TAS2R31:0.000269,TRBV10-1:0.000269,TRBV7-3:0.000269,ZNF646:0.00113,ZNF717:0.000269,ZNF93:0.00517"

## [1] "LV 665"
## [1] "AL050302.1:5.69e-05,ANKRD20A1:5.69e-05,ANKRD36:0.00745,ANP32B:0.000702,AQP7:0.0086,C16orf71:0.00116,CCDC144NL:0.000233,CD248:0.00118,CDC27:5.69e-05,CTBP2:0.000824,CTDSP2:5.69e-05,DACT2:0.00184,DHODH:0.000584,FAM104B:5.69e-05,FAM182B:5.69e-05,FRG1B:5.69e-05,GGT1:0.00106,GRIP1:0.00116,HNRNPCL1:5.69e-05,HYDIN:0.000358,IGSF3:5.69e-05,KIF26A:0.00194,KIR3DL2:0.00167,LRIG2:0.00149,MUC3A:0.000762,NPIPB15:0.000987,OR8U1:0.00302,PHC3:0.000555,POTED:0.000501,PRSS3:5.69e-05,SH3RF3:0.00575,SLC25A5:5.69e-05,SLC3A1:0.00118,ST6GAL2:0.000467,TAS2R19:5.69e-05,TAS2R31:5.69e-05,TNFRSF21:0.00116,TPBGL:0.00947,TRBV10-1:5.69e-05,TRBV7-3:5.69e-05,TYRP1:0.00116,ZNF717:5.69e-05,ZNF843:0.00116"

## [1] "LV 72"
## [1] "AL050302.1:0.00233,ANKRD20A1:0.00233,CDC27:0.00233,CTDSP2:0.00233,FAM104B:0.00233,FAM182B:0.00233,FRG1B:0.00233,GGT1:0.00674,HNRNPCL1:0.00233,HYDIN:0.00256,IGSF3:0.00233,MUC3A:0.00195,PRSS3:0.00233,SLC25A5:0.00233,TAS2R19:0.00233,TAS2R31:0.00233,TRBV10-1:0.00233,TRBV7-3:0.00233,ZNF717:0.00233"

## [1] "LV 851"
## [1] "AL050302.1:1.64e-06,ANKRD20A1:1.64e-06,ANKRD36:0.000272,ANKRD36C:0.00542,ANP32B:5.83e-05,AQP7:0.000224,C1orf51:0.000371,CCDC144NL:2.77e-05,CDC27:1.64e-06,CENPJ:0.000272,CTBP2:5.1e-06,CTDSP2:1.64e-06,FAM104B:1.64e-06,FAM182B:1.64e-06,FLYWCH1:0.000272,FRG1B:1.64e-06,FTSJ3:0.000465,GGT1:1.02e-05,GLI2:0.00733,HNRNPCL1:1.64e-06,HYDIN:6.5e-05,IGSF3:1.64e-06,MUC3A:2.07e-05,NHS:0.000465,NPIPB15:0.00853,OR8U1:0.000698,POTED:2.41e-05,PRAMEF2:0.0049,PRSS1:0.00982,PRSS3:1.64e-06,RNH1:0.000272,SLC25A5:1.64e-06,TAS2R19:1.64e-06,TAS2R31:1.64e-06,TAS2R43:0.000546,TRBV10-1:1.64e-06,TRBV7-3:1.64e-06,ZNF717:1.64e-06"

## [1] "LV 864"
## [1] "AL050302.1:0.00449,ANKRD20A1:0.00449,CDC27:0.00449,CTDSP2:0.00449,FAM104B:0.00449,FAM182B:0.00449,FRG1B:0.00449,HNRNPCL1:0.00449,IGSF3:0.00449,PRSS3:0.00449,SLC25A5:0.00449,TAS2R19:0.00449,TAS2R31:0.00449,TRBV10-1:0.00449,TRBV7-3:0.00449,ZNF717:0.00449"

## [1] "LV 984"
## [1] "AL050302.1:0.000112,ANKRD20A1:0.000112,ANKRD36:0.00126,ANKRD36C:0.00275,ANP32B:0.000555,AQP7:0.000454,CCDC144NL:0.000225,CDC27:0.000112,CTBP2:0.00148,CTDSP2:0.000112,FAM104B:0.000112,FAM182B:0.000112,FIP1L1:0.00942,FRG1B:0.000112,GBP4:0.00256,GGT1:0.000187,HNRNPCL1:0.000112,HYDIN:0.000403,IGSF3:0.000112,LRIG1:0.00548,MUC12:0.00375,MUC3A:0.000294,MUC6:0.00415,NPIPB15:0.0051,OR4C5:0.00183,OR8U1:0.000843,PABPN1L:0.00949,POTED:0.000446,PRAMEF2:0.0014,PRSS3:0.000112,SLC25A5:0.000112,TAS2R19:0.000112,TAS2R31:0.000112,TAS2R43:0.00232,TRBV10-1:0.000112,TRBV7-3:0.000112,ZNF717:0.000112"

## [1] "LV 751"
## [1] "ANKRD31:0.0019,MFSD2B:0.00091,REEP3:0.0019,WDR76:0.0019"

## [1] "LV 380"
## [1] "ASTN2:0.00272,ATXN7L1:0.00727,MCM5:8.02e-05,MS4A15:0.00035,MUC16:0.000557,MUC2:0.00241,MUC6:6.79e-06,OR6N2:0.000133,ZNF780B:0.00035"

## [1] "LV 816"
## [1] "ATP6AP1:8.33e-07,NDUFC2:0.00142"

## [1] "985,IRIS_Neutrophil-Resting"
## [1] "C13orf35:0.000985,C16orf71:0.0019,C17orf99:0.000985,CERS3:0.000985,GRIP1:0.0019,OR7G3:0.000985,PHC3:0.00315,PODXL:0.000985,PODXL2:0.000985,SLC37A2:0.000985,SPATC1L:0.000985,TNFRSF21:0.0019,TPBGL:0.00101,TYRP1:0.0019,ZCCHC2:0.000985,ZNF510:0.000985,ZNF843:0.0019"

## [1] "LV 957"
## [1] "C1orf51:3.16e-05,UNC13D:0.00809"

## [1] "LV 690"
## [1] "C9orf129:0.00651,CAMSAP1:0.000761,CD248:1.4e-05,IL4I1:0.00235,KIAA0586:0.000225,SLC3A1:1.4e-05"

## [1] "LV 229"
## [1] "CCDC144NL:0.000833,COL5A2:0.00789,OR8U1:0.00722"

## [1] "LV 379"
## [1] "CCDC144NL:0.000938,OR8U1:0.00127"

## [1] "LV 917"
## [1] "CD248:0.00387,FDXR:0.00431,MUC6:0.00556,SLC3A1:0.00387,TICAM1:0.00109,ZNF646:2.19e-05"

## [1] "LV 533"
## [1] "CDC42BPG:0.00228,MUC3A:0.00576,STYK1:0.00228"

## [1] "LV 496"
## [1] "CEP85:0.00339"

## [1] "LV 9"
## [1] "COL5A2:0.00796,FDXR:0.00728,PHF3:0.00615,SCAMP3:0.00713,ZBED6:0.00749"

## [1] "13,REACTOME_GLUCOSE_METABOLISM"
## [1] "CTD-3193O13.9:0.00833,GPR114:0.00833,KIF26A:0.000528,PPP6R2:0.00833,SPG7:0.00865"

## [1] "LV 272"
## [1] "DACT2:0.00183,SH3RF3:0.00158"

## [1] "LV 32"
## [1] "DCBLD2:0.000323,GYLTL1B:2.17e-05,MUC16:0.00305,PPP1R18:9.69e-06,SH3RF3:2.75e-05"

## [1] "LV 484"
## [1] "DGKG:0.00371,EPHX2:0.00371,GTF3A:0.00951,PPP1R13B:0.00422,SDF4:0.00855"

## [1] "LV 303"
## [1] "DYTN:0.00793,SH3RF3:0.00975,ZNF646:1.32e-06"

## [1] "LV 520"
## [1] "GZMA:1.21e-05,TDRD6:0.000135"

## [1] "928,DMAP_ERY3"
## [1] "HBQ1:0.00307,OR8B12:0.00278,VWA3A:0.00278"

## [1] "31,SVM B cells naive"
## [1] "HEXDC:5.59e-05,NT5DC4:5.03e-05,PP2D1:2.9e-05,SERPINA1:0.00252,SSTR5:0.000356,ST6GAL2:2.73e-06"

## [1] "827,KEGG_B_CELL_RECEPTOR_SIGNALING_PATHWAY"
## [1] "HSF4:3.1e-10"

## [1] "LV 15"
## [1] "HSF4:1.49e-06"

## [1] "LV 88"
## [1] "KCTD8:3.44e-05"

## [1] "LV 94"
## [1] "KDM4D:0.0089"

## [1] "LV 418"
## [1] "KIAA0586:0.00388,RSF1:0.000254"

## [1] "LV 971"
## [1] "METTL17:0.000173"

## [1] "LV 625"
## [1] "MUC16:0.00817"

## [1] "953,IRIS_Monocyte-Day1"
## [1] "NEK5:6.34e-11,NETO2:6.34e-11"

## [1] "97,KEGG_ARACHIDONIC_ACID_METABOLISM"
## [1] "OR8B12:0.00261,VWA3A:0.00261"

## [1] "LV 100"
## [1] "PDHA2:0.000772"

## [1] "LV 624"
## [1] "RP11-231C14.4:0.00722,ZNF93:3.05e-06"

## [1] "LV 909"
## [1] "SCN9A:5.42e-05,UNC13D:0.000218"

## [1] "45,REACTOME_RNA_POL_I_PROMOTER_OPENING"
## [1] "SOX11:0.000229"

## [1] "767,SVM B cells naive"
## [1] "TPSD1:0.00676,ZNF462:0.000111"

## [1] "LV 376"
## [1] "ZNF93:3.97e-05"

#}

Breaking down by tumor type

At first glance it seems that a lot of these are separating out cNFs (i.e. mast cell signaling) from other types. However, I’m getting the same error I get in notebook number 11, so I am unsure how to proceed.

#this is a failed attempt to group by tumor type
#with.sig<-counts%>%ungroup()%>%subset(gene%in%top.genes$Hugo_Symbol)%>%
#    group_by(latent_var,tumorType,gene)%>%
#  mutate(pval=t.test(value~status)$p.value)%>%
#  ungroup()%>%
#  group_by(latent_var)%>%
#  mutate(corP=p.adjust(pval))%>%ungroup()%>%
#  select(latent_var,tumorType,gene,pval,corP)%>%distinct()

#sig.vals<-subset(with.sig,corP<0.05)

#DT::datatable(sig.vals)
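
The error message itself is not shown here, so the following is only a guess at one plausible cause: once the data are split by tumor type, some latent variable/gene groups may have fewer than two samples in one of the WT or Mutated arms, and `t.test()` stops with an error in that case. A minimal sketch of a guard, using a hypothetical helper `safe_pval` (not part of this notebook) and invented data:

```r
# Hypothetical helper (not in the notebook): return NA instead of erroring
# when a group is too small for Welch's t-test.
safe_pval <- function(value, status) {
  group.sizes <- table(status)
  # t.test needs two groups with at least two observations each.
  if (length(group.sizes) != 2 || any(group.sizes < 2)) {
    return(NA_real_)
  }
  tryCatch(t.test(value ~ status)$p.value,
           error = function(e) NA_real_)
}

# Toy data: the second call has a single "Mutated" sample, which would make
# a bare t.test() stop with an error.
ok  <- safe_pval(c(1.0, 1.2, 3.1, 3.3), c("WT", "WT", "Mutated", "Mutated"))
bad <- safe_pval(c(1.0, 1.2, 3.1),      c("WT", "WT", "Mutated"))
print(c(ok = ok, bad = bad))
```

If this is indeed the cause, calling something like `safe_pval(value, status)` in place of `t.test(value~status)$p.value` inside the grouped `mutate()` would let the pipeline finish, with `NA` marking the groups that could not be tested.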